american english
Limit cycles for speech
Gafos, Adamantios I., Kuberski, Stephan R.
Rhythmic fluctuations in acoustic energy and accompanying neuronal excitations in cortical oscillations are characteristic of human speech, yet whether a corresponding rhythmicity inheres in the articulatory movements that generate speech remains unclear. The received understanding of speech movements as discrete, goal-oriented actions struggles to make contact with the rhythmicity findings. In this work, we demonstrate that an unintuitive -- but no less principled than the conventional -- representation for discrete movements reveals a pervasive limit cycle organization and unlocks the recovery of previously inaccessible rhythmic structure underlying the motor activity of speech. These results help resolve a time-honored tension between the ubiquity of biological rhythmicity and discreteness in speech, the quintessential human higher function, by revealing a rhythmic organization at the most fundamental level of individual articulatory actions.
Are Stereotypes Leading LLMs' Zero-Shot Stance Detection?
Dubreuil, Anthony, Gourru, Antoine, Largeron, Christine, Trabelsi, Amine
Large Language Models inherit stereotypes from their pretraining data, leading to biased behavior toward certain social groups in many Natural Language Processing tasks, such as hateful speech detection or sentiment analysis. Surprisingly, the evaluation of this kind of bias in stance detection methods has been largely overlooked by the community. Stance Detection involves labeling a statement as being against, in favor, or neutral towards a specific target and is among the most sensitive NLP tasks, as it often relates to political leanings. In this paper, we focus on the bias of Large Language Models when performing stance detection in a zero-shot setting. We automatically annotate posts in pre-existing stance detection datasets with two attributes: dialect or vernacular of a specific group and text complexity/readability, to investigate whether these attributes influence the model's stance detection decisions. Our results show that LLMs exhibit significant stereotypes in stance detection tasks, such as incorrectly associating pro-marijuana views with low text complexity and African American dialect with opposition to Donald Trump.
Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
Pan, Eileen, Choi, Anna Seo Gyeong, ter Hoeve, Maartje, Seto, Skyler, Koenecke, Allison
Large language models (LLMs) are ubiquitous in modern-day natural language processing. However, previous work has shown degraded LLM performance for under-represented English dialects. We analyze the effects of typifying "standard" American English language questions as non-"standard" dialectal variants on multiple choice question answering tasks and find up to a 20% reduction in accuracy. Additionally, we investigate the grammatical basis of under-performance in non-"standard" English questions. We find that individual grammatical rules have varied effects on performance, but some are more consequential than others: three specific grammar rules (existential "it", zero copula, and y'all) can explain the majority of performance degradation observed in multiple dialects. We call for future work to investigate bias mitigation methods focused on individual, high-impact grammatical structures.
Toward Responsible ASR for African American English Speakers: A Scoping Review of Bias and Equity in Speech Technology
Cunningham, Jay L., Adjagbodjou, Adinawa, Basoah, Jeffrey, Jawara, Jainaba, Kadoma, Kowe, Lewis, Aaleyah
This scoping literature review examines how fairness, bias, and equity are conceptualized and operationalized in Automatic Speech Recognition (ASR) and adjacent speech and language technologies (SLT) for African American English (AAE) speakers and other linguistically diverse communities. Drawing from 44 peer-reviewed publications across Human-Computer Interaction (HCI), Machine Learning/Natural Language Processing (ML/NLP), and Sociolinguistics, we identify four major areas of inquiry: (1) how researchers understand ASR-related harms; (2) inclusive data practices spanning collection, curation, annotation, and model training; (3) methodological and theoretical approaches to linguistic inclusion; and (4) emerging practices and design recommendations for more equitable systems. While technical fairness interventions are growing, our review highlights a critical gap in governance-centered approaches that foreground community agency, linguistic justice, and participatory accountability. We propose a governance-centered ASR life-cycle as an emergent interdisciplinary framework for responsible ASR development and offer implications for researchers, practitioners, and policymakers seeking to address language marginalization in speech AI systems.
Probing for Phonology in Self-Supervised Speech Representations: A Case Study on Accent Perception
Venkateswaran, Nitin, Tang, Kevin, Wayland, Ratree
Traditional models of accent perception underestimate the role of gradient variations in phonological features which listeners rely upon for their accent judgments. We investigate how pretrained representations from current self-supervised learning (SSL) models of speech encode phonological feature-level variations that influence the perception of segmental accent. We focus on three segments: the labiodental approximant, the rhotic tap, and the retroflex stop, which are uniformly produced in the English of native speakers of Hindi as well as other languages in the Indian sub-continent. We use the CSLU Foreign Accented English corpus (Lander, 2007) to extract, for these segments, phonological feature probabilities using Phonet (Vásquez-Correa et al., 2019) and pretrained representations from Wav2Vec2-BERT (Barrault et al., 2023) and WavLM (Chen et al., 2022) along with accent judgements by native speakers of American English. Probing analyses show that accent strength is best predicted by a subset of the segment's pretrained representation features, in which perceptually salient phonological features that contrast the expected American English and realized non-native English segments are given prominent weighting. A multinomial logistic regression of pretrained representation-based segment distances from American and Indian English baselines on accent ratings reveals strong associations between the odds of accent strength and distances from the baselines, in the expected directions. These results highlight the value of self-supervised speech representations for modeling accent perception using interpretable phonological features.
Disparities in LLM Reasoning Accuracy and Explanations: A Case Study on African American English
Zhou, Runtao, Wan, Guangya, Gabriel, Saadia, Li, Sheng, Gates, Alexander J, Sap, Maarten, Hartvigsen, Thomas
Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning tasks, leading to their widespread deployment. However, recent studies have highlighted concerning biases in these models, particularly in their handling of dialectal variations like African American English (AAE). In this work, we systematically investigate dialectal disparities in LLM reasoning tasks. We develop an experimental framework comparing LLM performance given Standard American English (SAE) and AAE prompts, combining LLM-based dialect conversion with established linguistic analyses. We find that LLMs consistently produce less accurate responses and simpler reasoning chains and explanations for AAE inputs compared to equivalent SAE questions, with disparities most pronounced in social science and humanities domains. These findings highlight systematic differences in how LLMs process and reason about different language varieties, raising important questions about the development and deployment of these systems in our multilingual and multidialectal world. Our code repository is publicly available at https://github.com/Runtaozhou/dialect_bias_eval.
The "negative end" of change in grammar: terminology, concepts and causes
The "negative end" of change is, in contrast to innovation and emergence, largely under-researched. Yet it has lately started to gain increasing attention from language scholars worldwide. The focus of this article is threefold: to discuss the i) terminology, ii) concepts and iii) causes associated with the "negative end" of change in grammar. The article starts with an overview of research conducted on the topic. It then situates phenomena referred to as loss, decline or obsolescence among processes of language change, before elaborating on the terminology and concepts behind them. The last part looks at possible causes for constructions to display a (gradual or rapid, but very consistent) decrease in frequency of use over time, which continues until the construction disappears or only residual or fossilised forms are left.
Variation of sentence length across time and genre
The goal of this paper is threefold: i) to present some practical aspects of using the full-text version of the Corpus of Historical American English (COHA), the largest diachronic multi-genre corpus of the English language, in the investigation of a linguistic trend of change; ii) to test a widely held assumption that sentence length in written English has been steadily decreasing over the past few centuries; iii) to point to a possible link between changes in sentence length and changes in English syntactic usage. The empirical proof of concept for iii) is provided by the decline in the frequency of the non-finite purpose subordinator "in order to". Sentence length, genre and the likelihood of occurrence of "in order to" are shown to be interrelated.
Can Grammarly and ChatGPT accelerate language change? AI-powered technologies and their impact on the English language: wordiness vs. conciseness
The proliferation of NLP-powered language technologies and AI-based natural language generation models, together with the status of English as a mainstream means of communication among both native and non-native speakers, makes the output of AI-powered tools especially intriguing to linguists. This paper investigates how Grammarly and ChatGPT affect the English language regarding wordiness vs. conciseness. A case study focusing on the purpose subordinator "in order to" is presented to illustrate the way in which Grammarly and ChatGPT recommend shorter grammatical structures instead of longer and more elaborate ones. Although the analysed sentences were produced by native speakers, are perfectly correct, and were extracted from a language corpus of contemporary English, both Grammarly and ChatGPT suggest more conciseness and less verbosity, even for relatively short sentences. The present article argues that technologies such as Grammarly not only mirror language change but also have the potential to facilitate or accelerate it.
Finding A Voice: Evaluating African American Dialect Generation for Chatbot Technology
Finch, Sarah E., Paek, Ellie S., Kwon, Sejung, Choi, Ikseon, Wells, Jessica, Chandler, Rasheeta, Choi, Jinho D.
As chatbots become increasingly integrated into everyday tasks, designing systems that accommodate diverse user populations is crucial for fostering trust, engagement, and inclusivity. This study investigates the ability of contemporary Large Language Models (LLMs) to generate African American Vernacular English (AAVE) and evaluates the impact of AAVE usage on user experiences in chatbot applications. We analyze the performance of three LLM families (Llama, GPT, and Claude) in producing AAVE-like utterances at varying dialect intensities and assess user preferences across multiple domains, including healthcare and education. Despite LLMs' proficiency in generating AAVE-like language, findings indicate that AAVE-speaking users prefer Standard American English (SAE) chatbots, with higher levels of AAVE correlating with lower ratings for a variety of characteristics, including chatbot trustworthiness and role appropriateness. These results highlight the complexities of creating inclusive AI systems and underscore the need for further exploration of diversity to enhance human-computer interactions.